Abstract

We present an implemented compila- 
tion algorithm that translates HPSG 
into lexicalized feature-based TAG, 
relating concepts of the two theo- 
ries. While HPSG has a more elab- 
orated principle-based theory of pos- 
sible phrase structures, TAG provides 
the means to represent lexicalized 
structures more explicitly. Our objec- 
tives are met by giving clear defini- 
tions that determine the projection of 
structures from the lexicon, and iden- 
tify "maximal" projections, auxiliary 
trees and foot nodes. 

1 Introduction 

Head Driven Phrase Structure Grammar 
(HPSG) and Tree Adjoining Grammar (TAG) 
are two frameworks which so far have been 
largely pursued in parallel, taking little or no 
account of each other. In this paper we will de- 
scribe an algorithm which will compile HPSG 
grammars, obeying certain constraints, into 
TAGs. However, we are not only interested in 
mapping one formalism into another, but also 
in exploring the relationship between concepts
employed in the two frameworks. 

HPSG is a feature-based grammatical frame- 
work which is characterized by a modular speci- 
fication of linguistic generalizations through ex- 
tensive use of principles and lexicalization of 
grammatical information. Traditional gram- 
mar rules are generalized to schemata provid- 
ing an abstract definition of grammatical rela- 
tions, such as head-of, complement-of, subject- 
of, adjunct-of, etc. Principles, such as the Head- 
Feature-, Valence-, Non-Local- or Semantics- 
Principle, determine the projection of informa- 
tion from the lexicon and recursively define 
the flow of information in a global structure. 
Through this modular design, grammatical de- 
scriptions are broken down into minimal struc- 
tural units referring to local trees of depth one, 
jointly constraining the set of well-formed
sentences.

In HPSG, based on the concept of "head-domains",
local relations (such as complement-of,
adjunct-of) are defined as those that are realized
within the domain defined by the syntactic
head. This domain is usually the maximal
projection of the head, but it may be further ex- 
tended in some cases, such as raising construc- 
tions. In contrast, filler-gap relations are con- 
sidered non-local. This local vs. non-local dis- 
tinction in HPSG cuts across the relations that 
are localized in TAG via the domains defined 
by elementary trees. Each elementary tree typ- 
ically represents all of the arguments that are 
dependent on a lexical functor. For example, 
the complement-of and filler-gap relations are 
localized in TAG, whereas the adjunct-of relation
is not.

Thus, there is a fundamental distinction be- 
tween the different notions of localization that 
have been assumed in the two frameworks. If, at 
first sight, these frameworks seem to involve a 
radically different organization of grammatical 
relations, it is natural to question whether it is 
possible to compile one into the other in a man- 
ner faithful to both, and more importantly, why 
this compilation is being explored at all. We 
believe that by combining the two approaches 
both frameworks will profit. 

From the HPSG perspective, this compilation 
offers the potential to improve processing effi- 
ciency. HPSG is a "lexicalist" framework, in the 
sense that the lexicon contains the information 
that determines which specific categories can 
be combined. However, most HPSG grammars 
are not lexicalized in the stronger sense defined 
by Schabes et al. (SAJ88), where lexicalization
means that each elementary structure in the 
grammar is anchored by some lexical item. For 
example, HPSG typically assumes a rule schema 
which combines a subject phrase (e.g. NP) with 
a head phrase (e.g. VP), neither of which is 
a lexical item. Consider a sentence involving 
a transitive verb which is derived by applying 
two rule schemata, reducing first the object and 
then the subject. In a standard HPSG deriva- 
tion, once the head verb has been retrieved, it 
must be computed that these two rules (and 
no other rules) are applicable, and then infor- 
mation about the complement and subject con- 
stituents is projected from the lexicon accord- 
ing to the constraints on each rule schema. On 
the other hand, in a lexicalized TAG derivation, 
a tree structure corresponding to the combined 
instantiation of these two rule schemata is di- 
rectly retrieved along with the lexical item for 
the verb. Therefore, a procedure that compiles 
HPSG to TAG can be seen as performing signifi- 
cant portions of an HPSG derivation at compile- 
time, so that the structures projected from lexi- 
cal items do not need to be derived at run-time. 
The compilation to TAG provides a way of pro- 
ducing a strongly lexicalized grammar which is 
equivalent to the original HPSG, and we expect 
this lexicalization to yield a computational benefit
in parsing (cf. (SJ90)).

This compilation strategy also raises several 
issues of theoretical interest. While TAG belongs
to a class of mildly context-sensitive grammar
formalisms (JVW91), the generative capacity
of the formalism underlying HPSG (viz.,
recursive constraints over typed feature struc- 
tures) is unconstrained, allowing any recursively 
enumerable language to be described. In HPSG 
the constraints necessary to characterize the 
class of natural languages are stated within a 
very expressive formalism, rather than built 
into the definition of a more restrictive formal- 
ism, such as TAG. Given the greater expressive 
power of the HPSG formalism, it will not be 
possible to compile an arbitrary HPSG grammar
into a TAG grammar. However, our compilation
algorithm shows that particular HPSG
grammars may contain constraints which have
the effect of limiting the generative capacity
to that of a mildly context-sensitive language.¹
Additionally, our work provides a new perspec- 
tive on the different types of constituent com- 
bination in HPSG, enabling a classification of 
schemata and principles in terms of more ab- 
stract functor-argument relations. 

From a TAG perspective, using concepts em- 
ployed in the HPSG framework, we provide an 
explicit method of determining the content of 
the elementary trees (e.g., what to project from 
lexical items and when to stop the projection) 
from an HPSG source specification. This also 
provides a method for deriving the distinctions 
between initial and auxiliary trees, including the 
identification of foot nodes in auxiliary trees. 
Our answers, while consistent with basic tenets 
of traditional TAG analyses, are general enough 
to allow an alternate linguistic theory, such as 
HPSG, to be used as a basis for deriving a TAG. 
In this manner, our work also serves to investi- 
gate the utility of the TAG framework itself as a 
means of expressing different linguistic theories 
and intuitions. 

In the following we will first briefly describe the basic constraints we assume for the HPSG input grammar and the resulting form of TAG. Next we describe the essential algorithm that determines the projection of trees from the lexicon, and give formal definitions of auxiliary tree and foot node. We then show how the computation of "sub-maximal" projections can be triggered and carried out in a two-phase compilation.

1 We are only considering a syntactic fragment of HPSG here. It is not clear whether the semantic components of HPSG can also be compiled into a more constrained formalism.

2 Background 

As the target of our translation we assume a 
Lexicalized Tree-Adjoining Grammar (LTAG),
in which every elementary tree is anchored by a
lexical item (SAJ88).

We do not assume atomic labelling of nodes, 
unlike traditional TAG, where the root and foot 
nodes of an auxiliary tree are assumed to be 
labelled identically. Such trees are said to factor 
out recursion. However, this identity itself isn't 
sufficient to identify foot nodes, as more than 
one frontier node may be labelled the same as 
the root. Without such atomic labels in HPSG, 
we are forced to address this issue, and present 
a solution that is still consistent with the notion 
of factoring recursion. 

Our translation process yields a lexicalized feature-based TAG (VSJ88) in which feature structures are associated with nodes in the frontier of trees and two feature structures (top and bottom) with nodes in the interior. Following (VS92), the relationships between such top and bottom feature structures represent underspecified domination links. Two nodes standing in this domination relation could become the same, but they are necessarily distinct if adjoining takes place. Adjoining separates them by introducing the path from the root to the foot node of an auxiliary tree as a further specification of the underspecified domination link.

For illustration of our compilation, we consider an extended HPSG following the specifications in (PS94)[404ff]. The rule schemata include rules for complementation (including head-subject and head-complement relations), head-adjunct, and filler-head relations.

The following rule schemata cover the combination of heads with subjects and other complements respectively as well as the adjunct constructions.²

Head-Subj-Schema
We assume a slightly modified and constrained treatment of non-local dependencies (slash), in which empty nodes are eliminated and a lexical rule is used instead. While slash introduction is based on the standard filler-head schema, slash percolation is essentially constrained to the head spine.

2 We abstract from quite a number of properties and use the following abbreviations for feature names: S=SYNSEM, L=LOCAL, C=CAT, N-L=NON-LOCAL, D=DTRS.

Head-Filler-Schema

Slash termination is accounted for by a lexical
rule, which removes an element from one of the 
valence lists (comps or subj) and adds it to the 
slash list. 

Lexical Slash-Termination-Rule

The percolation of slash across head domains
is lexically determined. Most lexical items 
will be specified as having an empty slash list. 
Bridge verbs (e.g., equi verbs such as want) or 
other heads allowing extraction out of a complement
share their own slash value with the slash
of the respective complement.³

3 We choose such a lexicalized approach, because it will allow us to maintain a restriction that every TAG tree resulting from the compilation must be rooted in a non-empty lexical item. The approach will account for extraction of complements out of complements, i.e., along paths corresponding to chains of government relations.

As far as we can see, the only limitation arising from the percolation of SLASH only along head-projections is on extraction out of adjuncts, which may be desirable for some languages like English. On the other hand, these constructions would have to be treated by multi-component TAGs, which are not covered by the intended interpretation of the compilation algorithm anyway.

Finally, we assume that rule schemata and principles have been compiled together (automatically or manually) to yield more specific subtypes of the schemata. This does not involve a loss of generalization but simply means a further refinement of the type hierarchy. LP constraints could be compiled out beforehand or during the compilation of TAG structures, since the algorithm is lexicon driven.

3 Algorithm
3.1 Basic Idea

While in TAG all arguments related to a particular functor are represented in one elementary tree structure, the 'functional application' in HPSG is distributed over the phrasal schemata, each of which can be viewed as a partial description of a local tree. Therefore we have to identify which constituents in a phrasal schema count as functors and arguments. In TAG different functor-argument relations, such as head-complement, head-modifier etc., are represented in the same format as branches of a trunk projected from a lexical anchor. As mentioned, this anchor is not always equivalent to the HPSG notion of a head; in a tree projected from a modifier, for example, a non-head (adjunct-dtr) counts as a functor. We therefore have to generalize over different types of daughters in HPSG and define a general notion of a functor. We compute the functor-argument structure on the basis of a general selection relation. Following (Kas92)⁴, we adopt the notion of a selector daughter (SD), which contains a selector feature (SF) whose value constrains the argument (or non-selector) daughter (non-SD).⁵ For example, in a head-complement structure, the SD is the head-dtr, as it contains the list-valued feature comps (the SF) each of whose elements selects a comp-dtr, i.e., an element of the comps list is identified with the synsem value of a comp-dtr.

We assume that a reduction takes place along 
with selection. Informally, this means that if 
F is the selector feature for some schema, then 
the value (or the element(s) in the list-value) of
F that selects the non-SD(s) is not contained in 
the F value of the mother node. In case F is list- 
valued, we assume that the rest of the elements 
in the list (those that did not select any daugh- 
ter) are also contained in the F at the mother 
node. Thus we say that F has been reduced by 
the schema in question. 
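
The reduction condition can be illustrated with a small sketch. The Python function below is not taken from the implementation described in this paper (which was written in Lisp); the name `reduce_selector_feature` and the use of plain strings for list elements are assumptions made purely for illustration.

```python
from typing import List


def reduce_selector_feature(sf_value: List[str], selected: List[str]) -> List[str]:
    """Compute the mother's value of a list-valued selector feature F.

    Elements of F that selected a non-SD are dropped; all remaining
    elements are passed up to the mother unchanged.
    """
    remaining = list(sf_value)  # copy, so the daughter's value is untouched
    for item in selected:
        # Raises ValueError if `item` is not an element of F.
        remaining.remove(item)
    return remaining


# Hypothetical head daughter with COMPS <NP, PP> combining with its NP complement.
mother_comps = reduce_selector_feature(["NP", "PP"], ["NP"])
```

In this toy setting the mother keeps the unselected PP on its comps list, which is what it means for comps to have been reduced by the schema.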

The compilation algorithm assumes that all HPSG schemata will satisfy the condition of simultaneous selection and reduction, and that each schema reduces at least one SF. For the head-complement- and head-subject-schema, these conditions follow from the Valence Principle, and the SFs are comps and subj, respectively. For the head-adjunct-schema, the adjunct-dtr is the SD, because it selects the head-dtr by its mod feature. The mod feature is reduced, because it is a head feature, whose value is inherited only from the head-dtr and not from the adjunct-dtr. Finally, for the filler-head-schema, the head-dtr is the SD, as it selects the filler-dtr by its slash value, which is bound off, not inherited by the mother, and therefore reduced.

4 The algorithm presented here extends and refines the approach described by (Kas92) by stating more precise criteria for the projection of features, for the termination of the algorithm, and for the determination of those structures which should actually be used as elementary trees.

5 Note that there might be mutual selection (as in the case of the specifier-head-relations proposed in (PS94)[44ff]). If there is mutual selection, we have to stipulate one of the daughters as the SD. The choice made would not affect the correctness of the compilation.

We now give a general description of the com- 
pilation process. Essentially, we begin with a 
lexical description and project phrases by using 
the schemata to reduce the selection informa- 
tion specified by the lexical type. 

Basic Algorithm Take a lexical type L and 
initialize by creating a node with this type. 
Add a node n dominating this node. 

For any schema S in which specified SFs of 
n are reduced, try to instantiate S with n 
corresponding to the SD of S. Add another 
node m dominating the root node of the in- 
stantiated schema. (The domination links 
are introduced to allow for the possibility 
of adjoining.) Repeat this step (each time 
with n as the root node of the tree) until 
no further reduction is possible. 
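
The basic algorithm can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the implemented compiler: feature structures are reduced to lists of strings, unification is replaced by a non-emptiness check, every schema reduces one element of one SF at a time, and the termination refinements discussed later are left out. The names `Node`, `Schema`, `project` and `trunk_labels` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    label: str
    sfs: Dict[str, List[str]]
    children: List["Node"] = field(default_factory=list)


@dataclass
class Schema:
    name: str
    sf: str  # the selector feature this schema reduces

    def instantiate(self, sd: Node) -> Optional[Node]:
        """Use `sd` as selector daughter; return the mother, or None if not applicable."""
        values = sd.sfs.get(self.sf, [])
        if not values:
            return None  # nothing specified to reduce
        selected, rest = values[0], values[1:]
        mother_sfs = dict(sd.sfs)
        mother_sfs[self.sf] = rest  # reduction: the selecting element is removed
        argument = Node(label=selected, sfs={})  # non-SD, not elaborated further
        return Node(label=self.name, sfs=mother_sfs, children=[sd, argument])


def project(lexical_sfs: Dict[str, List[str]], schemata: List[Schema]) -> Node:
    anchor = Node(label="anchor", sfs=dict(lexical_sfs))
    # Node n dominating the anchor; in this sketch all SFs are raised.
    n = Node(label="dom", sfs=dict(anchor.sfs), children=[anchor])
    applied = True
    while applied:
        applied = False
        for schema in schemata:
            mother = schema.instantiate(n)
            if mother is not None:
                # Add a node dominating the root of the instantiated schema.
                n = Node(label="dom", sfs=dict(mother.sfs), children=[mother])
                applied = True
                break
    return n


def trunk_labels(root: Node) -> List[str]:
    """Labels along the trunk, from the root down to the lexical anchor."""
    labels, node = [], root
    while True:
        labels.append(node.label)
        if not node.children:
            return labels
        node = node.children[0]  # the SD is always stored first


# Hypothetical transitive verb: one complement and one subject to reduce.
transitive = project(
    {"comps": ["NP"], "subj": ["NP"]},
    [Schema("head-comp", "comps"), Schema("head-subj", "subj")],
)
```

For the hypothetical transitive entry, the trunk alternates domination nodes with schema instantiations (head-comp below head-subj), and both SFs are empty at the root.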

We will fill in the details below in the fol- 
lowing order: what information to raise across 
domination links (where adjoining may take 
place), how to determine auxiliary trees (and 
foot nodes), and when to terminate the projec- 
tion. 

We note that the trees produced have a trunk 
leading from the lexical anchor (node for the 
given lexical type) to the root. The nodes that 
are siblings of nodes on the trunk, the selected 
daughters, are not elaborated further and serve 
either as foot nodes or substitution nodes. 

3.2 Raising Features Across 
Domination Links 

Quite obviously, we must raise the SFs across 
domination links, since they determine the ap- 
plicability of a schema and licence the instanti- 
ation of an SD. If no SF were raised, we would 
lose all information about the saturation status 
of a functor, and the algorithm would terminate 
after the first iteration. 

There is a danger in raising more than the 
SFs. For example, the head-subject-schema
in German would typically constrain a verbal
head to be finite. Raising head features would 
block its application to non-finite verbs and we 
would not produce the trees required for raising-verb
adjunction. This is again because heads in
HPSG are not equivalent to lexical anchors in
TAG, and because other local properties of the top
and bottom of a domination link could differ.
Therefore head features and other local features 
cannot, in general, be raised across domination 
links, and we assume for now that only the SFs 
are raised. 

Raising all SFs produces only fully saturated 
elementary trees and would require the root and 
foot of any auxiliary tree to share all SFs, in or- 
der to be compatible with the SF values across 
any domination links where adjoining can take 
place. This is too strong a condition and will not 
allow the resulting TAG to generate all the trees 
derivable with the given HPSG (e.g., it would 
not allow unsaturated VP complements). In 
§3.5 we address this concern by using a multi-phase
compilation. In the first phase, we raise
all the SFs.

3.3 Detecting Auxiliary Trees and Foot 
Nodes 

Traditionally, in TAG, auxiliary trees are said 
to be minimal recursive structures that have a 
foot node (at the frontier) labelled identically to
the root. As such category labels (S, NP, etc.)
determine where an auxiliary tree can be ad- 
joined, we can informally think of these labels 
as providing selection information correspond- 
ing to the SFs of HPSG. Factoring of recur- 
sion can then be viewed as saying that auxiliary 
trees define a path (called the spine) from the 
root to the foot where the nodes at extremities 
have the same selection information. However, 
a closer look at TAG shows that this is an over- 
simplification. If we take into account the ad- 
joining constraints (or the top and bottom fea- 
ture structures), then it appears that the root 
and foot share only some selection information. 

Although the encoding of selection informa- 
tion by SFs in HPSG is somewhat different than 
that traditionally employed in TAG, we also 
adopt the notion that the extremities of the
spine in an auxiliary tree share some part (but
not necessarily all) of the selection information. 
Thus, once we have produced a tree, we exam- 
ine the root and the nodes in its frontier. A tree 
is an auxiliary tree if the root and some frontier 
node (which becomes the foot node) have some 
non-empty SF value in common. Initial trees 
are those that have no such frontier nodes. 
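
A minimal sketch of this test, assuming a strongly simplified encoding: each node is a dictionary from SF names to a token, structure sharing is modelled by equal tokens (for example the tag `"#1"`), and the constant `EMPTY` stands for an empty list. The function name `find_foot` and this encoding are illustrative assumptions, not the representation used in the implementation.

```python
from typing import Dict, List, Optional

EMPTY = "<>"  # an empty list; it never counts as shared selection information


def find_foot(root_sfs: Dict[str, str], frontier: List[Dict[str, str]]) -> Optional[int]:
    """Return the index of the first frontier node that shares a non-empty
    SF value with the root, or None if the tree is an initial tree."""
    for index, node_sfs in enumerate(frontier):
        for feature, value in root_sfs.items():
            if value != EMPTY and node_sfs.get(feature) == value:
                return index
    return None


# Hypothetical adverb tree: root and head daughter share subj and slash.
adverb_root = {"subj": "#1", "slash": "#2"}
adverb_frontier = [{"subj": "#1", "slash": "#2"}]

# Hypothetical saturated verb tree: nothing non-empty is shared with the frontier.
verb_root = {"subj": EMPTY, "comps": EMPTY, "slash": EMPTY}
verb_frontier = [{}, {}]

adverb_foot = find_foot(adverb_root, adverb_frontier)
verb_foot = find_foot(verb_root, verb_frontier)
```

Under these assumptions the adverb tree is classified as auxiliary with its only frontier node as foot, while the saturated verb tree has no foot and is therefore initial.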
[Figure: elementary tree anchored by want (equi verb); nodes carry SUBJ, COMPS and SLASH values.]



In the trees shown, nodes detected as foot 
nodes are marked with *. Because of the subj 
and slash values, the head-dtr is the foot of T2 
below (anchored by an adverb) and comp-dtr is 
the foot of T3 (anchored by a raising verb). 
Note that in the tree T1 anchored by an equi-verb,
the foot node is detected because the
slash value is shared, although the subj is not. 
As mentioned, we assume that bridge verbs, 
i.e., verbs which allow extraction out of their 
complements, share their slash value with their 
clausal complement. 

3.4 Termination 

Returning to the basic algorithm, we will now 
consider the issue of termination, i.e., how much 
do we need to reduce as we project a tree from 
a lexical item. 

Normally, we expect an SF with a specified
value to be reduced fully to an empty list by
a series of applications of rule schemata. However,
note that the slash value is unspecified at
the root of the trees T2 and T3. Of course, such 
nodes would still unify with the SD of the filler- 
head-schema (which reduces slash), but apply- 
ing this schema could lead to an infinite recur- 
sion. Applying a reduction to an unspecified 
SF is also linguistically unmotivated as it would 
imply that a functor could be applied to an ar- 
gument that it never explicitly selected. 

However, simply blocking the reduction of an
SF whenever its value is unspecified isn't sufficient.
For example, the root of T2 specifies
the subj to be a non-empty list. Intuitively, it
would not be appropriate to reduce it further, 
because the lexical anchor (adverb) doesn't se- 
mantically license the subj argument itself. It 
merely constrains the modified head to have an 
unsaturated SUBJ. 
[Figure: tree T3, anchored by a raising verb.]

To motivate our termination criterion, con- 
sider the adverb tree and the asterisked node 
(whose slash value is shared with slash at the 
root). Being a non-trunk node, it will either be 
a foot or a substitution node. In either case, 
it will eventually be unified with some node in 
another tree. If that other node has a reducible 
slash value, then we know that the reduction 
takes place in the other tree, because the slash 
value must have been raised across the domina- 
tion link where adjoining takes place. As the 
same slash (and likewise subj) value should not 
be reduced in both trees, we state our termination
criterion as follows:

Termination Criterion The value of an SF 
F at the root node of a tree is not reduced 
further if it is an empty list, or if it is shared 
with the value of F at some non-trunk node 
in the frontier. 
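
The criterion can be read as a simple predicate over the root and the non-trunk frontier nodes. The sketch below uses an assumed encoding in which each node maps SF names to a token, shared values are equal tokens, and `EMPTY` stands for the empty list; the name `may_reduce` is hypothetical, and treating an SF that is absent from the root as not reducible is an additional assumption of this sketch.

```python
from typing import Dict, List

EMPTY = "<>"  # token standing for an empty list value


def may_reduce(feature: str, root_sfs: Dict[str, str],
               non_trunk_frontier: List[Dict[str, str]]) -> bool:
    """Return True if the value of `feature` at the root may be reduced further."""
    value = root_sfs.get(feature)
    if value is None or value == EMPTY:
        return False  # nothing left to reduce (absent SF: assumption of this sketch)
    for node_sfs in non_trunk_frontier:
        if node_sfs.get(feature) == value:
            return False  # shared with a frontier node: reduced in another tree
    return True


# Hypothetical adverb tree: subj at the root is shared with the modified head.
blocked = may_reduce("subj", {"subj": "#1"}, [{"subj": "#1"}])

# Hypothetical verb projection: subj is specified and not shared with the frontier.
allowed = may_reduce("subj", {"subj": "#2", "comps": EMPTY}, [{"subj": "#1"}])
```

With these toy values, reduction of subj is blocked for the adverb tree and allowed for the verb projection, mirroring the two cases discussed in the text.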

Note that because of this termination crite- 
rion, the adverb tree projection will stop at this 
point. As the root shares some selector fea- 
ture values (slash and subj) with a frontier node, 
this node becomes the foot node. As observed 
above, adjoining this tree will preserve these val- 
ues across any domination links where it might 
be adjoined; and if the values stated there are 
reducible then they will be reduced in the other 
tree. While auxiliary trees allow arguments se- 
lected at the root to be realized elsewhere, it is 
never the case for initial trees that an argument 
selected at the root can be realized elsewhere, 
because by our definition of initial trees the se- 
lection of arguments is not passed on to a node 
in the frontier. 



We also obtain from this criterion a notion of 
local completeness. A tree is locally complete 
as soon as all arguments which it licenses and 
which are not licensed elsewhere are realized. 
Global completeness is guaranteed because the 
notion of "elsewhere" is only and always defined 
for auxiliary trees, which have to adjoin into an 
initial tree. 

3.5 Additional Phases 

Above, we noted that the preservation of some 
SFs along a path (realized as a path from the 
root to the foot of an auxiliary tree) does not im- 
ply that all SFs need to be preserved along that 
path. Tree T1 provides such an example, where
a lexical item, an equi-verb, triggers the reduc- 
tion of an SF by taking a complement that is 
unsaturated for subj but never shares this value 
with one of its own SF values. 

To allow for adjoining of auxiliary trees whose 
root and foot differ in their SFs, we could pro- 
duce a number of different trees representing 
partial projections from each lexical anchor. 
Each partial projection could be produced by 
raising some subset of SFs across each domi- 
nation link, instead of raising all SFs. How- 
ever, instead of systematically raising all possi- 
ble subsets of SFs across domination links, we 
can avoid producing a vast number of these par- 
tial projections by using auxiliary trees to pro- 
vide guidance in determining when we need to 
raise only a particular subset of the SFs. 

Consider T1, whose root and foot differ in
their SFs. From this we can infer that a subj 
SF should not always be raised across domina- 
tion links in the trees compiled from this gram- 
mar. However, it is only useful to produce a 
tree in which the subj value is not raised when 
the bottom of a domination link has both a one-element
list as value for subj and an empty comps
list. Having an empty subj list at the top of the
domination link would then allow for adjunction
by trees such as T1.

This leads to the following multi-phase com- 
pilation algorithm. In the first phase, all SFs are 
raised. It is determined which trees are auxil- 
iary trees, and then the relationships between 
the SFs associated with the root and foot in
these auxiliary trees are recorded. The second
phase begins with lexical types and considers 
the application of sequences of rule schemata 
as before. However, immediately after apply- 
ing a rule schema, the features at the bottom of 
a domination link are compared with the foot 
nodes of auxiliary trees that have differing SFs 
at foot and root. Whenever the features are 
compatible with such a foot node, the SFs are 
raised according to the relationship between the 
root and foot of the auxiliary tree in question. 
This process may need to be iterated based on 
any new auxiliary trees produced in the last 
phase. 



3.6 Example Derivation 

In the following we provide a sample derivation 
for the sentence 

(I know) what Kim wants to give to Sandy. 

Most of the relevant HPSG rule schemata and 
lexical entries necessary to derive this sentence 
were already given above. For the noun phrases 
what, Kim and Sandy, and the preposition to 
no special assumptions are made. We therefore 
only add the entry for the ditransitive verb give, 
which we take to subcategorize for a subject and
two object complements. 



Ditransitive Verb

[Feature structure: S with N-L|SLASH, L|C|SUBJ and L|C|COMPS; feature values not recoverable.]




From this lexical entry, we can derive in the 
first phase a fully saturated initial tree by ap- 
plying first the lexical slash-termination rule, 
and then the head-complement-, head-subject-
and filler-head-rule. Substitution at the nodes 
on the frontier would yield the string what Kim 
gives to Sandy. 



[Figure: tree T4, anchored by gives; trunk nodes annotated with SUBJ, COMPS and SLASH values (all empty at the root); PP node for to Sandy.]



The derivations for the trees for the ma- 
trix verb want and for the infinitival marker to 
(equivalent to a raising verb) were given above 
in the examples T1 and T3. Note that the subj
feature is only reduced in the former, but not in 
the latter structure. 



In the second phase we derive from the en- 
try for give another initial tree (T5) into which 
the auxiliary tree T1 for want can be adjoined
at the topmost domination link. We also produce
a second tree with similar properties for
the infinitive marker to (T6).




[Figure: tree T5, anchored by give; NP node (what) and PP node (to Sandy); trunk nodes annotated with SUBJ, COMPS and SLASH values.]



[Figure: tree T6, anchored by to; nodes annotated with SUBJ, COMPS and SLASH values; foot node marked *.]



By first adjoining the tree T6 at the topmost domination link of T5 we obtain a structure T7 corresponding to the substring what ... to give to Sandy. Adjunction involves the identification of the foot node with the bottom of the domination link and identification of the root with the top of the domination link. Since the domination link at the root of the adjoined tree mirrors the properties of the adjunction site in the initial tree, the properties of the domination link are preserved.
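
The two identifications involved in adjunction can be sketched as follows. This is a toy model under stated assumptions: feature structures are flat dictionaries with atomic string values, unification is plain agreement on shared features, and the feature values in the example are invented for illustration rather than read off the trees T5 and T6. The names `unify` and `adjoin_at_link` are hypothetical.

```python
from typing import Dict, Optional, Tuple

FS = Dict[str, str]  # flat feature structure: feature name -> atomic value


def unify(a: FS, b: FS) -> Optional[FS]:
    """Merge two flat feature structures; return None on a value clash."""
    merged = dict(a)
    for feature, value in b.items():
        if feature in merged and merged[feature] != value:
            return None
        merged[feature] = value
    return merged


def adjoin_at_link(top: FS, bottom: FS, aux_root: FS, aux_foot: FS) -> Optional[Tuple[FS, FS]]:
    """Identify the auxiliary root with the top of a domination link and the
    auxiliary foot with its bottom; return None if either identification fails."""
    new_top = unify(top, aux_root)
    new_bottom = unify(bottom, aux_foot)
    if new_top is None or new_bottom is None:
        return None
    return new_top, new_bottom


# Invented values: a link whose top and bottom both expect one missing subject.
link_top = {"comps": "<>"}
link_bottom = {"subj": "<NP>", "comps": "<>"}
aux = {"subj": "<NP>", "comps": "<>"}  # root and foot share all SFs here

result = adjoin_at_link(link_top, link_bottom, aux, aux)
clash = adjoin_at_link(link_top, link_bottom, aux, {"comps": "<VP>"})
```

The first call succeeds because root and foot are compatible with the two ends of the link; the second fails because the foot's comps value clashes with the bottom of the link.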




[Figure: tree T7, the result of adjoining T6 into T5; anchored by give, with to on the adjoined spine, NP node (what) and PP node (to Sandy); nodes annotated with SUBJ, COMPS and SLASH values.]

The final derivation step then involves the 
adjunction of the tree for the equi verb into 
this tree, again at the topmost domination link. 
This has the effect of inserting the substring 
Kim wants into what ... to give to Sandy. 

4 Conclusion 

We have described how HPSG specifications can 
be compiled into TAG, in a manner that is faith- 
ful to both frameworks. This algorithm has 
been implemented in Lisp and used to compile a 
significant fragment of a German HPSG. Work 
is in progress on compiling an English grammar 
developed at CSLI. 

This compilation strategy illustrates how linguistic
theories other than those previously explored
within the TAG formalism can be instantiated
in TAG, allowing the association of structures
with an enlarged domain of locality with
lexical items. We have generalized the notion 
of factoring recursion in TAG, by defining aux- 
iliary trees in a way that is not only adequate 
for our purposes, but also provides a uniform 
treatment of extraction from both clausal and 
non-clausal complements (e.g., VPs) that is not 
possible in traditional TAG. 

It should be noted that the results of our com- 
pilation will not always conform to conventional 
linguistic assumptions often adopted in TAGs, 
as exemplified by the auxiliary trees produced 
for equi verbs. Also, as the algorithm does not 
currently include any downward expansion from 
complement nodes on the frontier, the resulting 
trees will sometimes be more fractioned than if 
they had been specified directly in a TAG. 

We are currently exploring the possibility of
compiling HPSG into an extension of the TAG
formalism, such as D-tree grammars (RVW95)
or the UVG-DL formalism (Ram94). These
somewhat more powerful formalisms appear to
be adequate for some phenomena, such as extraction
out of adjuncts (recall §2) and certain
kinds of scrambling, which our current method
does not handle. More flexible methods of com- 
bining trees with dominance links may also lead 
to a reduction in the number of trees that must 
be produced in the second phase of our compi- 
lation. 

There are also several techniques that we ex- 
pect to lead to improved parsing efficiency of 
the resulting TAG. For instance, it is possible 
to declare specific non-SFs which can be raised, 
thereby reducing the number of useless trees 
produced during the multi-phase compilation. 
We have also developed a scheme to effectively 
organize the trees associated with lexical items. 